Abstract

A new paradigm of gene expression regulation has emerged recently with the 
discovery of microRNAs, an evolutionarily-conserved class of ~22 nucleotide (nt) RNAs. 
miRNAs control gene expression by base pairing with miRNA-recognition elements 
(MREs) found in their messenger RNA (mRNA) targets. Despite a large number of
reported miRNAs, their mRNA targets remain elusive. Here we use a combined
bioinformatics and experimental approach to identify important rules governing miRNA- 
MRE recognition that allow prediction of human and mouse miRNA targets. We predict 
mRNA targets for human and mouse miRNAs and provide a strategy to identify mRNA 
targets for all known miRNAs. 

miRNAs are derived from endogenous genes that are initially transcribed as
longer RNA transcripts {Lee, 1993 #23} {Wightman, 1993 #39} {Reinhart, 2000 #34}
{Lagos-Quintana, 2001 #17} {Lau, 2001 #21} {Lee, 2001 #22} {Mourelatos, 2002 #30}.
These transcripts are processed in the nucleus into ~75nt precursors (pre-miRNAs) that 
fold as single stem-loop structures {Lee, 2002 #24}. Pre-miRNAs are exported to the 
cytoplasm where the nuclease Dicer excises the mature miRNAs {Hutvagner, 2001 #12} 
{Lee, 2002 #24}. miRNAs are bound to proteins that belong to the Argonaute family and 
assemble with other proteins, including the Gemin3 and Gemin4 proteins, to form micro- 
Ribonucleoprotein complexes (miRNPs) {Mourelatos, 2002 #30}. Dicer also processes 
another class of ~22nt RNAs termed short interfering RNAs (siRNAs) {Elbashir, 2001 
#7} {Hamilton, 1999 #9} from double stranded RNAs {Bernstein, 2001 #3}. Like 
miRNAs, siRNAs are bound to Argonaute proteins {Hammond, 2001 #11} {Martinez, 
2002 #28} and assemble with additional proteins to form RNA-Induced Silencing 
Complexes (RISCs) {Hammond, 2001 #11}. siRNAs base-pair with their target RNAs 
with near perfect complementarity and direct target RNA endonucleolytic cleavage 
{Elbashir, 2001 #7}. This is the mechanistic basis of mRNA destruction in RNA 
interference {Fire, 1998 #8}. miRNAs function by base pairing with miRNA-recognition 
elements (MREs) found in their mRNA targets. The critical determinant of miRNA 
function is the degree of complementarity between a miRNA and its RNA target: if the 
complementarity is extensive (as is the case with plant miRNAs) the RNA target is 
cleaved (and the miRNA functions essentially as an siRNA) {Hutvagner, 2002 #13} 
{Rhoades, 2002 #35} {Llave, 2002 #27} {Tang, 2003 #37} {Xie, 2003 #40}; if the 
complementarity is partial, translation of the target mRNA is repressed {Olsen, 1999 
#31}{Seggerson, 2002 #36}{Zeng, 2002 #48}{Doench, 2003 #47}. In plants, the
computational identification of miRNA targets was facilitated by the extensive 
complementarity between plant miRNAs and their mRNA targets {Rhoades, 2002 #35}. 
The targets of two plant miRNAs (miR-171 and miR-162) have been verified experimentally
{Llave, 2002 #27} {Xie, 2003 #40}. Two mouse miRNAs (miR-127 and miR-136) also
show perfect antisense complementarity with the coding region of a retrotransposon-like
gene (Rtl1) {Seitz, 2003 #44}. However, most animal miRNAs are thought to recognize
and attenuate the translation of their mRNA targets via partial antisense complementarity 
{Lee, 1993 #23} {Wightman, 1993 #39} {Moss, 1997 #29} {Reinhart, 2000 #34} 
{Olsen, 1999 #31}{Zeng, 2002 #48}{Doench, 2003 #47}. Because of this partial 
complementarity, simple homology-based searches have failed to uncover targets for 
miRNAs in organisms other than plants ({Rhoades, 2002 #33} and our unpublished 
data). The few animal miRNA targets that are known have been identified in genetic
screens. In particular, genetic dissection of the heterochronic gene pathway in C. elegans
identified the lin-14 and lin-28 mRNAs as targets for the lin-4 miRNA {Lee, 1993 #23}
{Wightman, 1993 #39} {Moss, 1997 #29}, and the lin-41 mRNA as a target for the let-7
miRNA {Reinhart, 2000 #34}. Importantly, these studies demonstrated that individual
MRE sequences are necessary and sufficient to confer miRNA-dependent gene
expression regulation in MRE-bearing target mRNAs {Moss, 1997 #29} {Reinhart, 2000
#34}. In Drosophila, the novel bantam miRNA regulates the pro-apoptotic gene hid
{Brennecke, 2003 #4}, while the human miR-23 regulates the Hes1 transcription factor
{Kawasaki, 2003 #14}. Putative MREs for other miRNAs have been proposed {Lai, 
2002 #20}{Xu, 2003 #41}{Lin SY, 2003 #25}{Abrahante JE, 2003 #1}, but these are 
predominantly based on visual inspection of putative mRNA targets for partial 
complementarity with miRNAs and lack experimental verification. The rules guiding 
miRNA:MRE interactions are unknown, making prediction of miRNA targets virtually 
impossible. 

To search for human miRNA targets we employed a bioinformatics approach. We 
limited our searches to the 3'-UTRs of human mRNAs, extracted from the annotated
Reference mRNA Sequences (RefSeq) database (comprising a total of _ base pairs)
{Pruitt, 2003 #43}. Repetitive elements, such as Alu transposable elements that are
embedded in a random fashion in ~5% of all human mRNAs, were filtered out before
running the searches. We used six miRNAs (let-7b, miR-141, miR-24, miR-145, miR-23a 
and let-7e) all of which are 100% conserved between humans and mice. Because there 
were no known examples of human miRNA:MRE interactions, we could not follow the 
typical computational approach, which is to use known examples to formulate and train an
algorithm. Instead, we hypothesized that high affinity interactions, based on binding
energies between two RNAs paired imperfectly, might identify miRNA:MRE
interactions. We implemented a modified dynamic programming algorithm that
calculated the free energies of both canonical (Watson-Crick) and G-U wobble base
pairs {Tinoco, 1973 #50}. To identify putative MREs, we slid a 35nt window
over the 3'-UTR database and calculated the minimum binding energy between
the miRNAs and sequences in the 3'-UTR database. Mismatches were allowed and hits
were sorted by lowest binding energies. This analysis revealed _ hits. Because all
miRNAs used for the searches are 100% conserved between humans and mice, it is likely
that their mRNA targets, and hence their MREs, are also conserved. _ hits were found
to be conserved between human and mouse. Visual inspection of the data showed that one
of the predicted MREs for let-7b was found in the 3'-UTR of both the human and mouse 
mRNAs that code for the human/mouse homolog of the C. elegans LIN-28 protein, a 
putative RNA-binding protein (see Figure 1). This finding is particularly interesting 
because the lin-28 and let-7 genes function in the same developmental pathway in C.
elegans {Moss, 1997 #29}.
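
The search described above can be illustrated with a minimal sketch. It is not the algorithm used in this study: the pair energies and penalties are placeholder values chosen for illustration (real searches would use measured thermodynamic parameters), the local-alignment recursion is a generic stand-in for the modified dynamic programming algorithm, and the names `min_binding_energy` and `scan_utr` are hypothetical.

```python
# Placeholder pair energies (kcal/mol). These are illustrative values only,
# NOT the thermodynamic parameters used in the study.
PAIR_ENERGY = {
    ("A", "U"): -1.0, ("U", "A"): -1.0,   # Watson-Crick
    ("G", "C"): -2.0, ("C", "G"): -2.0,   # Watson-Crick
    ("G", "U"): -0.5, ("U", "G"): -0.5,   # G-U wobble
}
MISMATCH_PENALTY = 0.5   # assumed cost of two opposed, non-pairing nucleotides
GAP_PENALTY = 1.0        # assumed cost of one unpaired nucleotide (bulge/loop)


def min_binding_energy(mirna, site):
    """Lowest energy of a local antiparallel pairing of mirna with site.

    Both sequences are written 5'->3'. The site is reversed so that
    position i of the miRNA faces position j of the reversed site.
    """
    target = site[::-1]
    n, m = len(mirna), len(target)
    # best[i][j]: lowest energy of a pairing ending at mirna[i-1], target[j-1]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    lowest = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = PAIR_ENERGY.get((mirna[i - 1], target[j - 1]), MISMATCH_PENALTY)
            best[i][j] = min(
                0.0,                            # start a new local pairing
                best[i - 1][j - 1] + pair,      # pair (or mismatch) i with j
                best[i - 1][j] + GAP_PENALTY,   # miRNA nucleotide unpaired
                best[i][j - 1] + GAP_PENALTY,   # site nucleotide unpaired
            )
            lowest = min(lowest, best[i][j])
    return lowest


def scan_utr(mirna, utr, window=35):
    """Slide a window over one 3'-UTR; return (energy, start) sorted by energy."""
    last_start = max(1, len(utr) - window + 1)
    hits = [(min_binding_energy(mirna, utr[s:s + window]), s) for s in range(last_start)]
    hits.sort()
    return hits


# Toy example: a 6nt stand-in "miRNA" and a UTR holding its perfect complement.
toy_mirna = "UGAGGU"
toy_utr = "A" * 10 + "ACCUCA" + "A" * 10
print(scan_utr(toy_mirna, toy_utr, window=6)[0])
```

Sorting the `(energy, start)` tuples puts the most stable candidate sites first, which mirrors the ranking of hits by lowest binding energy; a conservation filter between human and mouse would be a separate step applied to the ranked list.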

Because MREs are necessary and sufficient to confer mi/siRNA-dependent 
translational repression{Moss, 1997 #29} {Reinhart, 2000 #34}, we reasoned that 
placement of predicted MREs for specific miRNAs in the 3'-UTR of a reporter construct,
followed by transfections in cells expressing the miRNAs that recognize the MREs, 
should lead to a decrease of the reporter protein levels. We cloned the predicted LIN-28 
MRE in the 3'-UTR of a Renilla Luciferase (RL) reporter construct. As a positive control, 
we generated two RL constructs each bearing in the 3'-UTR one of the two reported 
MREs for let-7, derived from the C. elegans LIN-41 mRNA, an experimentally verified 
let-7 target {Reinhart, 2000 #34}. As a negative control, the sequence of the LIN-28 
MRE was scrambled and placed in the 3'-UTR of RL. We cotransfected the RL-MRE 
bearing constructs along with a plasmid encoding Firefly Luciferase (FL) in two different 
cell lines: HeLa cells, a human epithelial cell line, and MN-1 cells, a mouse motor
neuronal cell line. These cell lines normally express all let-7 paralogs, which are 100%
conserved between humans and mice ({Lagos-Quintana, 2001 #17} {Dostie, 2003 #6}
and our unpublished data). Eighteen hours after transfection we quantitated the levels of
normalized RL/FL using standard luminometric assays. As shown in Figure 1, we
consistently observed a ~5-fold reduction in the protein levels of RL bearing the LIN-28
MRE versus RL bearing the scrambled MRE (negative control), an effect that is
stronger than that of the two positive control MREs derived from LIN-41 (LIN-41a and
LIN-41b, Figure 1). Similar results were obtained with both cell lines when the
luminometric assays were performed 16, 24 or 48 hours after transfections (our
unpublished data). These results confirm the validity of the predicted LIN-28 MRE.
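
The normalization behind these comparisons is simple arithmetic, sketched below under stated assumptions. The readings are invented numbers in arbitrary units, not data from this study, and the helper names (`normalized_ratios`, `summarize`, `fold_reduction`) are hypothetical. Each Renilla reading is divided by the Firefly reading from the same transfection, and the fold reduction is the ratio of the control mean to the test mean.

```python
from statistics import mean, stdev


def normalized_ratios(rl_readings, fl_readings):
    """Divide each Renilla (RL) reading by its paired Firefly (FL) reading."""
    if len(rl_readings) != len(fl_readings):
        raise ValueError("each RL reading needs a matching FL reading")
    return [rl / fl for rl, fl in zip(rl_readings, fl_readings)]


def summarize(ratios):
    """Mean and sample standard deviation across replicate experiments."""
    return mean(ratios), stdev(ratios)


def fold_reduction(test_ratios, control_ratios):
    """How many times lower the test construct reads than the control."""
    return mean(control_ratios) / mean(test_ratios)


# Invented readings for six experiments each (arbitrary units).
scrambled = normalized_ratios([980, 1010, 1005, 995, 1020, 990], [100] * 6)
lin28 = normalized_ratios([200, 190, 210, 205, 195, 200], [100] * 6)

print(summarize(lin28))
print(round(fold_reduction(lin28, scrambled), 2))
```

Dividing by the cotransfected Firefly signal corrects for differences in transfection efficiency between wells, so the remaining difference between constructs can be attributed to the MRE placed in the RL 3'-UTR.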

We next wished to determine which of the remaining MREs represented true
miRNA targets by investigating the rules that guide miRNA:MRE binding. We
hypothesized that such binding is guided by miRNA-associated protein(s) that impose
restraints on the position and sizes of loops and nucleotide bulges between miRNAs and
their cognate MREs. miRNP proteins, and in particular the Argonaute family of proteins,
represent excellent candidates for guiding such miRNA:target mRNA interactions. Indeed,
in a human neuronal cell line, a Gemin4-Argonaute-let-7b containing miRNP associates
with endogenous LIN-28 mRNA in polysomes (see accompanying Brevia manuscript).
To determine the rules of miRNA:MRE binding we generated a series of mutant LIN-28
MREs with varying binding properties between them and the human/mouse let-7b
miRNA (Figure 1B). The activity of these MREs was tested as described above, in HeLa
and MN-1 cells. As shown in Figure 1C, with the exception of the LIN-28-M3 MRE, single
nucleotide bulges between let-7b and the LIN-28 mutant MREs that map towards the
5'-end of let-7b abolish repression of RL expression (mutants LIN-28-M1, -M2, -M4, -M5
and -M6). The single nucleotide bulge of the LIN-28-M3 MRE is symmetrically placed
between the beginning of the loop and the beginning of base pairing between the 5'-most
let-7b nucleotide with LIN-28-M3. A similarly placed single nucleotide bulge is found
between let-7a and LIN-41a (one of the two LIN-41 MREs, present in the 3'-UTR of the
C. elegans lin-41 mRNA; Figure 1). The activities of both of these MREs are similar
(compare LIN-41a to LIN-28-M3 in Figure 1C). These results show that near perfect
complementarity between the first ~9 nt (from the 5'-end) of a miRNA and its cognate
MRE is required for miRNA function, and that the 5'-most nucleotide of miRNAs may or may
not base pair with MREs (see bindings between LIN-41a or LIN-41b MREs with let-7a in
Figure 1B). We refer to this region of miRNA:MRE binding as the proximal region.
Careful analysis of published work on si/miRNAs provides further support for this
claim: a. In the experimentally verified MREs for lin-4 and let-7, there is perfect base
pairing between the MREs and the first seven or eight (starting from the 5'-end of the
miRNA) nucleotides of each miRNA with none or only a single symmetrically placed
nucleotide bulge; and the 5'-most nucleotide of lin-4 and let-7 may or may not base pair
with MREs {Moss, 1997 #29}{Reinhart, 2000 #34}. b. Two loss-of-function mutants of
lin-4 and let-7 miRNAs, identified in genetic screens, are caused by single-point
mutations mapping in the first 6 nucleotides in both miRNAs and are predicted to disrupt
base pairing in the proximal region {Lee, 1993 #23}{Moss, 1997 #29}{Reinhart, 2000
#34}. c. The 5'-end of siRNAs sets the ruler for target RNA cleavage, implying that
recognition of the 5'-end of siRNAs is essential for their function {Elbashir, 2001 #7}. d.
A genetic, single-point mutation, present in the MRE of the Arabidopsis PHAVOLUTA
(PHV) mRNA, disrupts base pairing with the fifth nucleotide of its cognate miR-165/166
miRNA and dramatically reduces the miR-165/166-mediated cleavage of the mutant phv
mRNA {Tang, 2003 #37}. e. Single point mutations mapping in the first 7 nucleotides of
an siRNA abolish siRNA activity, whereas point mutations mapping towards the 3'-end
of the siRNA have no or much smaller effects {Amarzguioui, 2003 #2}.

In contrast to the strict requirements for base-pairing at the proximal region,
nucleotide bulges between LIN-28 mutant MREs and the 3'-end of let-7b (a region that
we refer to as the distal region) are tolerated and decrease by ~2-fold the activities of the
mutant LIN-28 MREs (Figure 1; LIN-28-M7, -M8 and -M9). The activity of LIN-28-M10,
which bears a single nucleotide mismatch away from the loop and close to the 3'-end
of let-7b, is essentially the same as that of the wild-type LIN-28 MRE. We next
determined the requirements for the size and position of the loops between let-7b and
mutant LIN-28 MREs. The optimal loop length found in the wild-type LIN-28 MRE is
5nt. As shown in Figure 1, LIN-28 mutant MREs with single, symmetrically placed loops
varying in size from 2nt to 4nt were still active (LIN-28-M12 to -M14), while a single
nucleotide substitution of the LIN-28 loop had the same activity as the wild-type LIN-28
MRE (LIN-28-M11). However, LIN-28 mutant MREs with loops longer than 5
nucleotides were unable to repress the Renilla luciferase activity (LIN-28-M15, -M16).
Finally, mutant LIN-28 MREs were designed that allowed for a single let-7b loop of
varying sizes. As shown in Figure 1, MREs with a 9nt or 7nt let-7b loop were active
(LIN-28-M17, -M18) but MREs with a let-7b loop of less than 5nt were inactive
(LIN-28-M19 to -M21). In fact, the activity of the LIN-28-M18 MRE is identical to that of
the wild-type LIN-28 MRE, and its binding to let-7b resembles the binding between the
C. elegans lin-4 miRNA and its lin-28 mRNA target {Moss, 1997 #29}.
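
The proximal, distal and loop requirements described above can be condensed into a small rule check, sketched here under stated assumptions. The thresholds follow the summary given in the Figure 2 legend, reading the minimum proximal pairing as 7nt and the minimum distal pairing as 5nt. The `Duplex` record and `follows_binding_rules` function are hypothetical names, and describing a duplex by a handful of counts is a simplification: it presumes the pairing has already been worked out and does not inspect the sequences themselves.

```python
from dataclasses import dataclass


@dataclass
class Duplex:
    """Hand-annotated summary of one miRNA:MRE pairing (hypothetical format)."""
    proximal_pairs: int            # base pairs at the 5' end of the miRNA
    distal_pairs: int              # base pairs at the 3' end of the miRNA
    mre_loop: int = 0              # unpaired MRE nucleotides in the central loop
    mirna_loop: int = 0            # unpaired miRNA nucleotides in the central loop
    proximal_bulges: int = 0       # single-nucleotide bulges in the proximal region
    bulge_symmetric: bool = False  # True if the one proximal bulge is symmetric


def follows_binding_rules(d):
    """Return True if the duplex satisfies the summarized binding rules."""
    # Proximal region: at least 7 base pairs, at most one symmetric bulge.
    if d.proximal_pairs < 7:
        return False
    if d.proximal_bulges > 1 or (d.proximal_bulges == 1 and not d.bulge_symmetric):
        return False
    # Distal region: at least 5 base pairs (bulges are tolerated there).
    if d.distal_pairs < 5:
        return False
    # Central loop: exactly one of three allowed configurations.
    if d.mre_loop and d.mirna_loop:       # double opposing loops, 2 to 3nt each
        return 2 <= d.mre_loop <= 3 and 2 <= d.mirna_loop <= 3
    if d.mre_loop:                        # single MRE loop, 2 to 5nt
        return 2 <= d.mre_loop <= 5
    if d.mirna_loop:                      # single miRNA loop, 6 to 9nt
        return 6 <= d.mirna_loop <= 9
    return False                          # no central loop at all


# Illustrative annotations, loosely modeled on the mutant series in the text.
wild_type_like = Duplex(proximal_pairs=8, distal_pairs=6, mre_loop=5)
long_mre_loop = Duplex(proximal_pairs=8, distal_pairs=6, mre_loop=6)
print(follows_binding_rules(wild_type_like), follows_binding_rules(long_mre_loop))
```

Keeping the check as a pure function of a few counts makes it easy to apply as a filter over candidate sites, at the cost of pushing the harder problem (deciding which nucleotides pair) into whatever produces the annotation.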

These experiments demonstrate that there are indeed rules that govern
miRNA:mRNA interactions, which may be generally applicable. These rules are
summarized in a schematic form in Figure 2. We note that the repressing properties of a
miRNA may depend on the way it interacts with its mRNA target. A miRNA:MRE
interaction with a single loop of optimal length is more potent (LIN-28, wt;
Figure 1) than one with two small opposing loops (LIN-41b; Figure 1). This finding may explain
the requirement, for optimal repression, of two MREs for let-7 in the 3'-UTR of the C.
elegans lin-41 mRNA {Reinhart, 2000 #34}. On the other hand, a single MRE for lin-4 in
the 3'-UTR of the C. elegans lin-28 mRNA suffices because it contains a single, 6nt
loop {Moss, 1997 #29}. The degree of miRNA-mediated translational repression may
ultimately depend on additional factors such as the miRNA and target mRNA
concentrations and the accessibility of MREs. The finding of a let-7b MRE in the 3'-UTR
of the human/mouse LIN-28 mRNA and the fact that endogenous human LIN-28 mRNA
associates with a let-7b-containing miRNP in polysomes (see accompanying paper)
strongly indicate that human let-7b and LIN-28 are part of the same pathway, which may
be functionally related to the C. elegans heterochronic gene pathway. The C. elegans
lin-28 mRNA is predominantly regulated by lin-4 {Moss, 1997 #29}. Although a direct role
for let-7 in the regulation of C. elegans lin-28 mRNA has not been shown, lin-28 is also
regulated by a lin-4 independent pathway {Seggerson, 2002 #36}. There are four let-7
paralogs in C. elegans and it is possible that one of them regulates lin-28.

We applied the miRNA Binding Rules (miBRs) to the putative MREs that our initial
algorithm had predicted and identified five (in addition to LIN-28) that followed them.
We tested these MREs (Figure 3A), as well as others that did not abide by the miBRs,
using the luciferase-based assay described above. The cognate miRNAs (miR-141, miR-24,
miR-145, miR-23a and let-7e) for these MREs are present in both HeLa and MN-1 cells
{Lagos-Quintana, 2001 #17}{Mourelatos, 2002 #30}{Lagos-Quintana, 2002 #19}{Dostie, 2003 #6}
(and our unpublished data). miR-141 was originally cloned from mouse {Lagos-Quintana,
2002 #19}. Human miR-141 containing two additional terminal nucleotides has also been
cloned from human colonic mucosa and deposited in the GenBank database as miR-157
(accession AJ535825). We have cloned miR-141 from HeLa and MN-1 cells and
confirmed the presence of the two additional nucleotides as shown in Figure 3A (our
unpublished data). The predicted MREs that did not abide by the miBRs were unable to
suppress the expression of luciferase (Table 1 and our unpublished data). In contrast, all
MREs that abided by the miBRs suppressed the expression of luciferase (Figure 3B). An
MRE for miR-141 (miR-157) is present in the 3'-UTR of both human and mouse mRNAs
coding for the CLOCK transcription factor, which is critical for circadian rhythms. Two
identical MREs for miR-24 are present in the 3'-UTR of both human and mouse mRNAs
coding for Mitogen-Activated Protein Kinase 14 (MAPK14, also known as p38α kinase;
Figure 3C). MAPK14 has pleiotropic cellular effects; it is a key regulator of
stress-induced signaling, cell proliferation and apoptosis and is required for placental and heart
development and erythropoiesis (see OMIM entry 600289). An MRE for miR-145 is found
in the 3'-UTR of both human and mouse mRNAs coding for a hypothetical 501
amino-acid protein termed FLJ21308 in humans and D13Ertd275e in mouse. FLJ21308 contains
a putative poly (ADP-ribose) polymerase catalytic domain, suggesting that it may
function in chromatin modification by ADP ribosylation. For the let-7e and miR-23
miRNAs, MREs were found in the 3'-UTR of human mRNAs coding for the structural
maintenance of chromosomes 1-like 1 protein (SMC1L1) and a 324 amino-acid
hypothetical protein termed FLJ13158, respectively. SMC1L1 functions in sister
chromatid cohesion during mitosis (see OMIM entries 606462 and 300040). The
FLJ13158 protein contains a 120 amino-acid domain of unknown function (termed the
DUF738 domain), which is highly conserved in worm, fly, rodent and human proteins. It
is tempting to speculate that the two hypothetical proteins whose expression is potentially
regulated by miR-145 and miR-23 function in development. If their expression were
developmentally restricted (as is the case for the C. elegans LIN-28, LIN-14 and LIN-41
proteins, whose mRNAs are regulated by miRNAs) it might explain why they have
escaped detection.

The FLJ13158 mRNA may be regulated by miR-23, a miRNA that was recently
shown to play a role in the differentiation of human NT2 neuronal cells {Kawasaki, 2003
#14}. In the same study miR-23 was reported to recognize an MRE within the coding
region of the Hes1 transcription factor mRNA {Kawasaki, 2003 #14}. The miR-23:Hes1
interaction does not abide by the miBRs that we have established in this report; in
particular there is no central loop and there are many nucleotide bulges or mismatches at
the proximal region {Kawasaki, 2003 #14}. Importantly, placement of a single Hes1
MRE at the 3'-UTR of a Renilla Luciferase construct did not affect the activity of
luciferase {Kawasaki, 2003 #14}, exactly as our rules would have predicted. However,
placement of five copies of Hes1 MREs repressed the activity of luciferase {Kawasaki,
2003 #14}. There are two possible explanations for this discrepancy. Our
validation assay for putative MREs utilizes a single MRE in the 3'-UTR of the RL
reporter, which is under the control of the relatively strong Herpes simplex thymidine
kinase promoter. We deliberately chose to insert a single MRE to avoid extraneous
effects of longer sequences that may arise when multiple MRE copies are used. At the
same time, our assay may not be sensitive enough to detect weaker miRNA:MRE
interactions that may become apparent when multiple MREs are used. This may be true
for miRNAs that are expressed at low levels or for low-affinity miRNA:MRE
interactions. Many miRNAs are surprisingly abundant {Lim, 2003 #49} and we expect
that a single MRE should suffice to detect high affinity miRNA:MRE interactions. This
appears to be the case with the miR-23:FLJ13158 interaction (Figure 3B), but not with the
miR-23:Hes1 interaction {Kawasaki, 2003 #14}. Another possibility is that there may exist
additional rules governing miRNA:MRE interactions, especially those occurring within
the coding regions of target mRNAs, as is the case for Hes1. Further work is required to
address these issues and to determine how miRNAs recognize putative MREs in regions
other than the 3'-UTRs of mRNAs. Nevertheless, this study provides a strategy to predict
RNA targets for all known miRNAs, and describes the first set of human miRNA targets.
We anticipate that these findings will facilitate the functional characterization of miRNAs
and the genes that they regulate.

Figure Legends

Figure 1

Experimental verification of a predicted miRNA recognition element (MRE) and of the
miRNA Binding Rules. A. Schematic representation of the reporter construct; red: coding
region. B. Potential base pairing between predicted MREs derived from the indicated
mRNAs (black) and their cognate miRNAs (blue); wt: wild-type sequence. Nucleotides
forming potential bulges between LIN-28 MRE mutants and let-7b miRNA are shown in
red. C. HeLa human cells (blue bars) or MN-1 mouse cells (orange bars) were
cotransfected with Renilla Luciferase (RL) constructs bearing the indicated MREs in the
3'-UTR, along with Firefly Luciferase (FL). Results shown are average values (with
standard deviations) of normalized RL/FL activities obtained from six separate 
experiments.

Figure 2

Schematic representation of miRNA:MRE bindings. Blue: miRNAs; red: MRE. P:
Proximal (relative to 5'-end of miRNA) region of miRNA:MRE binding; D: Distal region
of binding; L: loop. A. Double opposing loops; loop length = 2 to 3nt. B. Single MRE
loop; length = 2 to 5nt. C. Single miRNA loop; length = 6 to 9nt. Proximal binding
characteristics: ≥ 7nt base pairing between miRNA and MRE; the 5'-most nucleotide of
the miRNA may or may not base pair with the MRE; one symmetric nucleotide bulge
allowed. Distal binding characteristics: ≥ 5nt base pairing between miRNA and MRE;
nucleotide bulges allowed. In A and B the last (towards the 3'-end) nucleotides of the
miRNA may or may not base pair with the MRE.

Figure 3

Predicted miRNA targets. A. Potential base pairing between predicted MREs derived
from the indicated mRNAs (black) and their cognate miRNAs (blue). Accession numbers
(Human/Mouse): LIN-28 (NM_024674/NM_145833); CLOCK (AF011568/NM_007715);
MAPK14 (NM_139012/NM_011951); FLJ21308 (NM_024615/BC021315);
FLJ13158 (NM_024909); SMC1L1 (NM_006306). Numbers refer to
nucleotide positions based on the human mRNAs. B. HeLa (blue bars) or MN-1 (orange
bars) cells were cotransfected with Renilla Luciferase (RL) constructs bearing the
indicated MREs in the 3'-UTR, along with Firefly Luciferase (FL). Results shown are
average values (with standard deviations) of normalized RL/FL activities obtained from
six separate experiments. C. Schematic representation of MAPK14 MREs bound to
miR-24. Red: 5'-UTR and coding region.